We claim:

1. A method for upgrading a data stream of multimedia data, said data stream
comprising features with textual description, said method comprising: including a
set of phonetic translation hints in the data stream in addition to the textual
description, wherein said phonetic translation hints specify the phonetic
transcription of parts or words of the textual description.

2. The method according to claim 1, wherein each of said phonetic translation
hints is followed by a word and said phonetic transcription of said word. 

3. The method according to claim 1 or 2, wherein each of said phonetic 
translation hints with said phonetic transcription is valid for at least a portion of 
said textual description without requiring repetition of said phonetic transcription 
for each occurrence of a word, for which the phonetic transcription is given, in 
said textual description. 

4. The method according to claim 1, wherein said phonetic translation hints are 
embedded in an MPEG data stream associated with textual type descriptors. 

5. The method according to claim 4, wherein said MPEG data stream is an 
MPEG-7 data stream.

6. The method according to claim 1, further comprising referring to an alphabet in
a predetermined format for representation of phonetic transcription information. 

7. The method according to claim 6, wherein said alphabet is an international 
phonetic alphabet or SAMPA.

8. The method according to claim 1, wherein said phonetic translation hints
include a limited number of phonemes. 

9. The method according to claim 8, wherein said phonemes are represented 
with a binary fixed length or variable length code. 

10. The method according to claim 9, wherein coding of said phonemes takes
into account statistics of the phonemes. 

11. The method according to claim 1, further comprising storing said phonetic
translation hints in a speech recognition system to better identify corresponding 
elements of the textual description. 

12. The method according to claim 11, wherein the phonetic translation hints 
together with the corresponding elements of the textual description are 
implemented in text-to-speech interfaces, speech recognition devices, navigation 
systems, audio broadcast equipment or telephone applications, in which said
textual description is used in combination with phonetic information for search or 
filtering of information. 
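
By way of illustration only, and without limiting the claims, the phoneme coding referred to in claims 8 to 10 can be sketched as follows. The sketch assumes a toy phoneme inventory and uses Huffman coding as one possible variable length code that takes phoneme statistics into account; the claims do not name a particular code. The function names (fixed_length_codes, huffman_codes, encode) and the example transcription are hypothetical and do not come from the MPEG-7 specification.

```python
import heapq
from collections import Counter
from math import ceil, log2


def fixed_length_codes(phonemes):
    """Give every phoneme in a limited inventory a code of equal width."""
    symbols = sorted(phonemes)
    width = max(1, ceil(log2(len(symbols))))
    return {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}


def huffman_codes(counts):
    """Give frequent phonemes shorter codes (Huffman coding, an assumed choice)."""
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(transcription, codes):
    """Concatenate the binary codes of a phoneme sequence."""
    return "".join(codes[p] for p in transcription)


# Hypothetical transcription: each letter stands in for one phoneme symbol.
transcription = list("banana")
fixed = fixed_length_codes(set(transcription))
variable = huffman_codes(Counter(transcription))
fixed_bits = encode(transcription, fixed)
variable_bits = encode(transcription, variable)
print(len(fixed_bits), len(variable_bits))
```

In this toy case the statistics-aware code yields a shorter bit string than the fixed length code because the most frequent phoneme receives the shortest codeword; with real phoneme statistics the saving would depend on how skewed the distribution is.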